Using deep features to create an image classifier


In [1]:
import graphlab

Load CIFAR-10 dataset

This is a popular computer vision dataset used for benchmarking. The version loaded here is a small subset covering four classes (automobile, bird, cat, and dog), already split into a training set and a test set.


In [2]:
image_train_url = 'https://d396qusza40orc.cloudfront.net/phoenixassets/image_train_data.csv'
image_test_url = 'https://d396qusza40orc.cloudfront.net/phoenixassets/image_test_data.csv'

In [3]:
image_train_data = graphlab.SFrame(image_train_url)
image_train_data.head()


Downloading https://d396qusza40orc.cloudfront.net/phoenixassets/image_train_data.csv to /var/tmp/graphlab-williamgray1/19252/8625f2a8-f20a-4077-9db1-58a605fa78f4.csv
Finished parsing file https://d396qusza40orc.cloudfront.net/phoenixassets/image_train_data.csv
Parsing completed. Parsed 100 lines in 1.18667 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[int,str,str,array,array]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Read 1943 lines. Lines per second: 803.644
Finished parsing file https://d396qusza40orc.cloudfront.net/phoenixassets/image_train_data.csv
Parsing completed. Parsed 2005 lines in 2.46638 secs.
Out[3]:
 id | image                | label      | deep_features | image_array
 24 | Height: 32 Width: 32 | bird       | [0.242872, 1.09545, 0.0, 0.39363, 0.0, 0.0, ...] | [73.0, 77.0, 58.0, 71.0, 68.0, 50.0, 77.0, 69.0, ...]
 33 | Height: 32 Width: 32 | cat        | [0.525088, 0.0, 0.0, 0.0, 0.0, 0.0, 9.94829, 0.0, ...] | [7.0, 5.0, 8.0, 7.0, 5.0, 8.0, 5.0, 4.0, 6.0, 7.0, ...]
 36 | Height: 32 Width: 32 | cat        | [0.566016, 0.0, 0.0, 0.0, 0.0, 0.0, 9.9972, 0.0, ...] | [169.0, 122.0, 65.0, 131.0, 108.0, 75.0, ...]
 70 | Height: 32 Width: 32 | dog        | [1.1298, 0.0, 0.0, 0.778194, 0.0, 0.758051, ...] | [154.0, 179.0, 152.0, 159.0, 183.0, 157.0, ...]
 90 | Height: 32 Width: 32 | bird       | [1.71787, 0.0, 0.0, 0.0, 0.0, 0.0, 9.33936, 0.0, ...] | [216.0, 195.0, 180.0, 201.0, 178.0, 160.0, ...]
 97 | Height: 32 Width: 32 | automobile | [1.57819, 0.0, 0.0, 0.0, 0.0, 0.0, 9.00632, 0.0, ...] | [33.0, 44.0, 27.0, 29.0, 44.0, 31.0, 32.0, 45.0, ...]
107 | Height: 32 Width: 32 | dog        | [0.0, 0.0, 0.220678, 0.0, 0.0, 0.0, 8.58053, ...] | [97.0, 51.0, 31.0, 104.0, 58.0, 38.0, 107.0, 61.0, ...]
121 | Height: 32 Width: 32 | bird       | [0.0, 0.237535, 0.0, 0.0, 0.0, 0.0, 9.9908, 0.0, ...] | [93.0, 96.0, 88.0, 102.0, 106.0, 97.0, 117.0, ...]
136 | Height: 32 Width: 32 | automobile | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.57379, 0.0, 0.0, ...] | [35.0, 59.0, 53.0, 36.0, 56.0, 56.0, 42.0, 62.0, ...]
138 | Height: 32 Width: 32 | bird       | [0.658936, 0.0, 0.0, 0.0, 0.0, 0.0, 9.93748, 0.0, ...] | [205.0, 193.0, 195.0, 200.0, 187.0, 193.0, ...]
[10 rows x 5 columns]


In [4]:
image_test_data = graphlab.SFrame(image_test_url)
image_test_data.head()


Downloading https://d396qusza40orc.cloudfront.net/phoenixassets/image_test_data.csv to /var/tmp/graphlab-williamgray1/19252/0360416d-e3ff-46d0-8786-03b9b1658260.csv
Finished parsing file https://d396qusza40orc.cloudfront.net/phoenixassets/image_test_data.csv
Parsing completed. Parsed 100 lines in 1.20322 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[int,str,str,array,array]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Read 1940 lines. Lines per second: 746.417
Finished parsing file https://d396qusza40orc.cloudfront.net/phoenixassets/image_test_data.csv
Parsing completed. Parsed 4000 lines in 4.11863 secs.
Out[4]:
 id | image                | label      | deep_features | image_array
  0 | Height: 32 Width: 32 | cat        | [1.13469, 0.0, 0.0, 0.0, 0.0366498, 0.0, 9.3536, ...] | [158.0, 112.0, 49.0, 159.0, 111.0, 47.0, ...]
  6 | Height: 32 Width: 32 | automobile | [0.231359, 0.0, 0.0, 0.0, 0.0, 0.226023, 8.85989, ...] | [160.0, 37.0, 13.0, 185.0, 49.0, 11.0, ...]
  8 | Height: 32 Width: 32 | cat        | [0.0, 0.0, 0.0344192, 0.0, 0.0, 0.0, 11.0375, ...] | [23.0, 19.0, 23.0, 19.0, 21.0, 28.0, 21.0, 16.0, ...]
  9 | Height: 32 Width: 32 | automobile | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 11.6065, 0.0, 0.0, ...] | [217.0, 215.0, 209.0, 210.0, 208.0, 202.0, ...]
 12 | Height: 32 Width: 32 | dog        | [0.322317, 0.0, 1.24933, 0.0, 0.0, 0.0, 9.10822, ...] | [91.0, 64.0, 30.0, 82.0, 58.0, 30.0, 87.0, 73.0, ...]
 16 | Height: 32 Width: 32 | dog        | [0.0, 0.0, 0.347357, 0.0, 0.0, 0.0, 9.98674, 0.0, ...] | [95.0, 76.0, 78.0, 92.0, 77.0, 78.0, 89.0, 77.0, ...]
 24 | Height: 32 Width: 32 | dog        | [1.31558, 0.0, 0.0, 0.0, 0.0, 0.0, 8.71812, 0.0, ...] | [136.0, 134.0, 118.0, 142.0, 141.0, 126.0, ...]
 25 | Height: 32 Width: 32 | bird       | [0.0, 0.317289, 0.0, 1.36553, 0.54447, 0.0, ...] | [100.0, 103.0, 74.0, 68.0, 91.0, 65.0, 116.0, ...]
 31 | Height: 32 Width: 32 | dog        | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 9.26019, 0.0, 0.0, ...] | [127.0, 130.0, 81.0, 130.0, 133.0, 88.0, ...]
 33 | Height: 32 Width: 32 | dog        | [0.130787, 0.727667, 0.0, 0.0, 0.0, 0.0, 10.1179, ...] | [118.0, 113.0, 81.0, 122.0, 117.0, 83.0, ...]
[10 rows x 5 columns]

Train a classifier using raw image pixels (no deep features yet)

This model will be compared against the deep features model later on. The goal is to predict each image's label from its raw pixel values (the image_array column).
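
As a quick sanity check on the raw pixel features, the sketch below inspects one row of the training data. The expected length of 3072 (32 x 32 pixels x 3 color channels) is an assumption based on the "Number of unpacked features : 3072" line in the training log below.

# Sketch: inspect the raw pixel representation of the first training image.
# A 32x32 RGB image should unpack to 32 * 32 * 3 = 3072 values
# (assumption: consistent with the training log below).
first_row = image_train_data[0]
print(len(first_row['image_array']))   # expected: 3072
print(first_row['label'])              # 'bird' for the first row shown above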


In [5]:
raw_pixel_model = graphlab.logistic_classifier.create(image_train_data, target='label',
                                                     features=['image_array'])


PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

WARNING: The number of feature dimensions in this problem is very large in comparison with the number of examples. Unless an appropriate regularization value is set, this model may not provide accurate predictions for a validation/test set.
Logistic regression:
--------------------------------------------------------
Number of examples          : 1886
Number of classes           : 4
Number of feature columns   : 1
Number of unpacked features : 3072
Number of coefficients    : 9219
Starting L-BFGS
--------------------------------------------------------
+-----------+----------+-----------+--------------+-------------------+---------------------+
| Iteration | Passes   | Step size | Elapsed Time | Training-accuracy | Validation-accuracy |
+-----------+----------+-----------+--------------+-------------------+---------------------+
| 1         | 6        | 0.000015  | 3.697588     | 0.355779          | 0.344538            |
| 2         | 8        | 1.000000  | 4.804952     | 0.386002          | 0.378151            |
| 3         | 9        | 1.000000  | 5.503424     | 0.429480          | 0.462185            |
| 4         | 10       | 1.000000  | 6.161361     | 0.442736          | 0.478992            |
| 5         | 11       | 1.000000  | 6.819093     | 0.449629          | 0.478992            |
| 6         | 12       | 1.000000  | 7.501410     | 0.433722          | 0.428571            |
| 10        | 17       | 1.000000  | 10.485146    | 0.507423          | 0.512605            |
+-----------+----------+-----------+--------------+-------------------+---------------------+
TERMINATED: Iteration limit reached.
This model may not be optimal. To improve it, consider increasing `max_iterations`.
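
The log suggests increasing max_iterations. A minimal sketch of how that could look is below; max_iterations and validation_set are standard arguments of graphlab.logistic_classifier.create, and the value of 50 is illustrative rather than tuned.

# Sketch: re-train the raw pixel model with a higher iteration limit.
raw_pixel_model = graphlab.logistic_classifier.create(image_train_data,
                                                      target='label',
                                                      features=['image_array'],
                                                      max_iterations=50,    # illustrative; the run above stopped at 10
                                                      validation_set=None)  # disable validation tracking, as the progress note suggests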

Predict the first five test images with the raw pixel model


In [6]:
# actual image labels (correct answers)
image_test_data[0:5]['label']


Out[6]:
dtype: str
Rows: 5
['cat', 'automobile', 'cat', 'automobile', 'dog']

In [7]:
# model output
raw_pixel_model.predict(image_test_data[0:5])


Out[7]:
dtype: str
Rows: 5
['bird', 'cat', 'bird', 'automobile', 'dog']

The raw pixel model got only two of the five predictions correct (the automobile and the dog). That's an F.
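
The same check can be done programmatically. The sketch below assumes element-wise SArray comparison, which yields 1/0 flags that can be summed.

# Sketch: count how many of the five predictions match the true labels.
predictions = raw_pixel_model.predict(image_test_data[0:5])
num_correct = (predictions == image_test_data[0:5]['label']).sum()
print(num_correct)   # 2 for the run above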

More general evaluation of the raw pixel model


In [8]:
raw_pixel_model.evaluate(image_test_data)


Out[8]:
{'accuracy': 0.47625, 'auc': 0.7203336249999999, 'confusion_matrix': Columns:
 	target_label	str
 	predicted_label	str
 	count	int
 
 Rows: 16
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |     dog      |       cat       |  147  |
 |     cat      |       dog       |  400  |
 |     dog      |       dog       |  522  |
 |     bird     |    automobile   |   97  |
 |  automobile  |    automobile   |  609  |
 |     bird     |       cat       |   93  |
 |     bird     |       dog       |  266  |
 |  automobile  |       bird      |  125  |
 |     bird     |       bird      |  544  |
 |  automobile  |       cat       |   99  |
 +--------------+-----------------+-------+
 [16 rows x 3 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns., 'f1_score': 0.4688285324983248, 'log_loss': 1.220651965245015, 'precision': 0.48034554969626087, 'recall': 0.47625, 'roc_curve': Columns:
 	threshold	float
 	fpr	float
 	tpr	float
 	p	int
 	n	int
 	class	int
 
 Rows: 400004
 
 Data:
 +-----------+-----+-----+------+------+-------+
 | threshold | fpr | tpr |  p   |  n   | class |
 +-----------+-----+-----+------+------+-------+
 |    0.0    | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   1e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   2e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   3e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   4e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   5e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   6e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   7e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   8e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   9e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 +-----------+-----+-----+------+------+-------+
 [400004 rows x 6 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.}

The accuracy of this model is only 47.6%.
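
The evaluation output truncates the confusion matrix to its first 10 rows. As the output itself notes, print_rows can show the full 16-row matrix (4 true labels x 4 predicted labels); a short sketch:

# Sketch: print the full confusion matrix from the evaluation results.
raw_pixel_results = raw_pixel_model.evaluate(image_test_data)
raw_pixel_results['confusion_matrix'].print_rows(num_rows=16)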

Next, a model leveraging deep features

This model will use transfer learning, since the dataset is so small: deep features extracted by a network trained on the much larger ImageNet dataset are fed into a simple classifier.
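
For comparison with the 3072 raw pixel values, the precomputed deep features can be inspected the same way. The expected length of 4096 is an assumption based on the "Number of unpacked features : 4096" line in the training log below.

# Sketch: check the dimensionality of the precomputed deep features.
print(len(image_train_data[0]['deep_features']))   # expected: 4096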

Computing deep features for the images

The two lines below would compute the deep features. However, this step is computationally intensive, so the deep features are already included in the dataset. If they were not, I would run the two lines below.


In [9]:
# deep_learning_model = graphlab.load_model('http://s3.amazonaws.com/GraphLab-Datasets/deeplearning/imagenet_model_iter45')
# image_train_data['deep_features'] = deep_learning_model.extract_features(image_train_data)

Train a classifier using the deep features


In [10]:
deep_features_model = graphlab.logistic_classifier.create(image_train_data,
                                                         features=['deep_features'],
                                                         target='label')


PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

WARNING: The number of feature dimensions in this problem is very large in comparison with the number of examples. Unless an appropriate regularization value is set, this model may not provide accurate predictions for a validation/test set.
WARNING: Detected extremely low variance for feature(s) 'deep_features' because all entries are nearly the same.
Proceeding with model training using all features. If the model does not provide results of adequate quality, exclude the above mentioned feature(s) from the input dataset.
Logistic regression:
--------------------------------------------------------
Number of examples          : 1912
Number of classes           : 4
Number of feature columns   : 1
Number of unpacked features : 4096
Number of coefficients    : 12291
Starting L-BFGS
--------------------------------------------------------
+-----------+----------+-----------+--------------+-------------------+---------------------+
| Iteration | Passes   | Step size | Elapsed Time | Training-accuracy | Validation-accuracy |
+-----------+----------+-----------+--------------+-------------------+---------------------+
| 1         | 5        | 0.000131  | 3.364257     | 0.734310          | 0.752688            |
| 2         | 9        | 0.250000  | 6.298938     | 0.759414          | 0.774194            |
| 3         | 10       | 0.250000  | 7.280189     | 0.764121          | 0.763441            |
| 4         | 11       | 0.250000  | 8.243664     | 0.771967          | 0.774194            |
| 5         | 12       | 0.250000  | 9.221105     | 0.775628          | 0.784946            |
| 6         | 13       | 0.250000  | 10.221225    | 0.783473          | 0.784946            |
| 7         | 14       | 0.250000  | 11.168960    | 0.795502          | 0.774194            |
| 8         | 15       | 0.250000  | 12.209590    | 0.814854          | 0.774194            |
| 9         | 16       | 0.250000  | 13.169402    | 0.843619          | 0.795699            |
| 10        | 17       | 0.250000  | 14.110809    | 0.850941          | 0.784946            |
+-----------+----------+-----------+--------------+-------------------+---------------------+
TERMINATED: Iteration limit reached.
This model may not be optimal. To improve it, consider increasing `max_iterations`.

Try predicting the first five images again


In [11]:
# actual image labels (correct answers)
image_test_data[0:5]['label']


Out[11]:
dtype: str
Rows: 5
['cat', 'automobile', 'cat', 'automobile', 'dog']

In [12]:
# model output
deep_features_model.predict(image_test_data[0:5])


Out[12]:
dtype: str
Rows: 5
['cat', 'automobile', 'cat', 'automobile', 'dog']

It got them all correct! A+.
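
To eyeball these predictions, the five test images can be displayed inline. This sketch assumes the 'image' column holds GraphLab Image objects (as the "Height: 32 Width: 32" entries suggest) and uses GraphLab Canvas for rendering.

# Sketch: display the five test images in the notebook for a visual check.
graphlab.canvas.set_target('ipynb')
image_test_data[0:5]['image'].show()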

More general evaluation of the deep features model, similar to the evaluation of the raw pixel model


In [13]:
deep_features_model.evaluate(image_test_data)


Out[13]:
{'accuracy': 0.78025, 'auc': 0.937662249999998, 'confusion_matrix': Columns:
 	target_label	str
 	predicted_label	str
 	count	int
 
 Rows: 16
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |  automobile  |       cat       |   11  |
 |     dog      |       cat       |  211  |
 |  automobile  |       dog       |   5   |
 |     cat      |       bird      |   90  |
 |     bird     |       dog       |   51  |
 |     dog      |       bird      |   58  |
 |     cat      |    automobile   |   51  |
 |     bird     |       cat       |  112  |
 |     dog      |    automobile   |   21  |
 |     dog      |       dog       |  710  |
 +--------------+-----------------+-------+
 [16 rows x 3 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns., 'f1_score': 0.7788080106173169, 'log_loss': 0.5723048751210497, 'precision': 0.777826311332351, 'recall': 0.78025, 'roc_curve': Columns:
 	threshold	float
 	fpr	float
 	tpr	float
 	p	int
 	n	int
 	class	int
 
 Rows: 400004
 
 Data:
 +-----------+----------------+-----+------+------+-------+
 | threshold |      fpr       | tpr |  p   |  n   | class |
 +-----------+----------------+-----+------+------+-------+
 |    0.0    |      1.0       | 1.0 | 1000 | 3000 |   0   |
 |   1e-05   |     0.981      | 1.0 | 1000 | 3000 |   0   |
 |   2e-05   | 0.976333333333 | 1.0 | 1000 | 3000 |   0   |
 |   3e-05   |     0.974      | 1.0 | 1000 | 3000 |   0   |
 |   4e-05   | 0.971333333333 | 1.0 | 1000 | 3000 |   0   |
 |   5e-05   |     0.968      | 1.0 | 1000 | 3000 |   0   |
 |   6e-05   | 0.964666666667 | 1.0 | 1000 | 3000 |   0   |
 |   7e-05   | 0.961666666667 | 1.0 | 1000 | 3000 |   0   |
 |   8e-05   |      0.96      | 1.0 | 1000 | 3000 |   0   |
 |   9e-05   | 0.957333333333 | 1.0 | 1000 | 3000 |   0   |
 +-----------+----------------+-----+------+------+-------+
 [400004 rows x 6 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.}

Accuracy is 78%, a big improvement over the raw pixel model's 47.6%!
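
To put the two models side by side, the accuracies can be pulled directly from the evaluation dictionaries; a minimal sketch:

# Sketch: compare test accuracy of the two models directly.
raw_accuracy = raw_pixel_model.evaluate(image_test_data)['accuracy']
deep_accuracy = deep_features_model.evaluate(image_test_data)['accuracy']
print('raw pixels: %.3f, deep features: %.3f' % (raw_accuracy, deep_accuracy))
# roughly 0.476 vs 0.780 for the runs above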

